Code someone needs to run to reproduce your results
Don’t hard-code anything about the workflow into products
Openness as a spectrum
Varying degrees of openness
Practice peer-review
It’s okay to share imperfect code
Tools for Openness
Use the tools we have to make your work as open as possible
Google Drive - good for sharing data among collaborators
Use R code to pull data directly from Google Drive into R or onto a personal computer
GitHub repositories - can be kept private until published
Open science community
Openscapes training
NMFS R User Group
I don’t know about you but I’m feeling automated workflows
Save time and energy
Can put your energy towards developing models and not the tedious details
Make things uniform
Makes onboarding easier
Help future us
Helps you avoid mistakes and makes them easier to fix
Automate as much as possible!
Why SAP should adopt an open and automated workflow
Anyone can download a repository and reproduce our results
Makes it easier to run reviewer requests during WPSAR reviews
Easier onboarding when joining new projects
Helps share the responsibility among the team
Share knowledge within the group
Example: American Samoa Bottomfish Assessment
Raw data
Processed data
Model inputs
Code
Summary reports
Figures and tables
GitHub repository
Step 01: Retrieve raw data
library(googledrive)
library(googlesheets4)
library(dplyr)      # arrange(), desc()
library(lubridate)  # as_datetime()
library(purrr)      # map_chr()
library(this.path)  # here(.. = 1): one level above this script's folder

########## DOWNLOAD DATA FROM GOOGLE DRIVE ##########
# Check latest data from Google Drive but only download if it's more recent than on local repo
a <- drive_reveal(drive_ls(path = "https://drive.google.com/drive/u/1/folders/1pnH38cupmDU4O_KkKDhYWee_p4sTSD6u",
                           pattern = "Data"),
                  what = "modified_time")
a <- arrange(a, by = desc(modified_time))[1, ]  # Select most recent "Data" zip file

if (dir.exists(file.path(here(.. = 1), "Data"))) {
  Date.CurrentFolder <- as_datetime(file.info(file.path(here(.. = 1), "Data"))$mtime)
} else {
  # No local copy yet: use a very old date so the download always runs
  Date.CurrentFolder <- as_datetime("1900-01-01 01:01:01")
}

Date.GoogleFolder <- as_datetime(map_chr(a$drive_resource, "modifiedTime"))

if (Date.CurrentFolder < Date.GoogleFolder) {
  drive_download(file = a$id, overwrite = TRUE, path = file.path(here(.. = 1), a$name))
  unzip(file.path(here(.. = 1), a$name), exdir = here(.. = 1))
}
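The "only download if newer" check in Step 01 can be pulled into a small helper so it can be tried without touching Google Drive. The following is a minimal sketch in base R, not code from the assessment repository: `needs_download()` is a hypothetical name, and it assumes the remote modification time has already been parsed into a date-time.

```r
# Hypothetical helper (not from the original repository): decide whether the
# local data folder is older than the remote copy.
needs_download <- function(local_dir, remote_mtime) {
  # A missing local folder always triggers a download.
  if (!dir.exists(local_dir)) {
    return(TRUE)
  }
  local_mtime <- file.info(local_dir)$mtime  # POSIXct modification time
  local_mtime < remote_mtime
}

# A temporary folder stands in for the repository's Data folder
tmp <- file.path(tempdir(), "Data_example")
dir.create(tmp, showWarnings = FALSE)

needs_download(tmp, Sys.time() + 3600)                      # remote copy is newer
needs_download(tmp, as.POSIXct("1900-01-01", tz = "UTC"))   # remote copy is older
needs_download(file.path(tempdir(), "no_such_folder"), Sys.time())  # no local copy
```

Keeping the comparison separate from the download call makes it easier to confirm the logic before any files are overwritten.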
Step 02: Process data
########## PROCESS CATCH, CPUE, AND SIZE DATA ##########
# Each script is followed by rm(list=ls()) so it starts from a clean workspace
set.seed(123)
source(paste0(here(..=1),"/Scripts/01_Data scripts/01_CPUE_BBS_InitPrep.r")); rm(list=ls())
source(paste0(here(..=1),"/Scripts/01_Data scripts/02_CPUE_BBS_PropTable.r")); rm(list=ls())
source(paste0(here(..=1),"/Scripts/01_Data scripts/03a_CPUE_BBS_Wind.r")); rm(list=ls())
source(paste0(here(..=1),"/Scripts/01_Data scripts/03b_CPUE_BBS_PCA.r")); rm(list=ls())
source(paste0(here(..=1),"/Scripts/01_Data scripts/04_CPUE_BBS_FinalPrep.r")); rm(list=ls())
source(paste0(here(..=1),"/Scripts/01_Data scripts/06_CATCH_BBS_FinalPrep.r")); rm(list=ls())
set.seed(123)
source(paste0(here(..=1),"/Scripts/01_Data scripts/07_CATCH_SBS_PropTable.r")); rm(list=ls())
source(paste0(here(..=1),"/Scripts/01_Data scripts/08_CATCH_SBS_FinalPrep.r")); rm(list=ls())
source(paste0(here(..=1),"/Scripts/01_Data scripts/09_CATCH_Final.r")); rm(list=ls())
source(paste0(here(..=1),"/Scripts/01_Data scripts/10_SIZE.r")); rm(list=ls())

########## RUN CPUE STANDARDIZATION ##########
# Run CPUE standardization and export indices for input into SS
source(paste0(here(..=1),"/Scripts/01_Data scripts/05_CPUE_BBS_Standardize_Function3.r"))
source(paste0(here(..=1),"/Scripts/01_Data scripts/05_CPUE_BBS_Standardize_Function2.r"))

# Run CPUE standardization for all species, areas combined in a loop
root_dir <- this.path::here(.. = 1)
Species.List <- c("APRU","APVI","CALU","ETCO","LERU","LUKA","PRFL","PRZO","VALO")
for (i in seq_along(Species.List)) {
  Standardize_CPUE3(Sp=Species.List[i], Interaction=TRUE, minYr=1988, maxYr=2015)
}
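The repeated `source(...); rm(list=ls())` pattern in Step 02 could also be written as a loop that runs each script in its own environment. This is a hedged sketch rather than the repository's approach: `run_scripts()` is a hypothetical helper, and unlike the slide it resets the seed before every script.

```r
# Hypothetical helper: source each script in a fresh environment so objects
# created by one script are not visible to the next.
run_scripts <- function(paths, seed = 123) {
  envs <- list()
  for (p in paths) {
    if (!file.exists(p)) stop("Script not found: ", p)
    set.seed(seed)                          # assumption: same seed before every script
    env <- new.env(parent = globalenv())    # isolated workspace for this script
    source(p, local = env)
    envs[[basename(p)]] <- env
  }
  invisible(envs)
}

# Two throwaway scripts stand in for the data-processing scripts
demo_dir <- file.path(tempdir(), "scripts_demo")
dir.create(demo_dir, showWarnings = FALSE)
s1 <- file.path(demo_dir, "01_first.r")
s2 <- file.path(demo_dir, "02_second.r")
writeLines("x <- 1", s1)
writeLines("y <- 2", s2)

envs <- run_scripts(c(s1, s2))
ls(envs[["01_first.r"]])   # objects created by the first script only
ls(envs[["02_second.r"]])  # objects created by the second script only
```

Returning the environments helps with debugging. If memory is a concern, the return value can simply be discarded so each script's objects are freed.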
Step 03: Set input arguments
Arguments we rarely changed
Lt <- vector("list", 9)                # Species options
Lt[[1]] <- list("APRU",                # Name
                "SW_Then",             # M
                "SW_BBS_BIOS",         # Growth
                "Kamikawa",            # Length-Weight
                "SW_BBS_BIOS",         # Maturity
                F,                     # Use InitF?
                c(0.5, 1.6),           # Range of R0 prof.
                0.29,                  # Btarg. value
                T,                     # Include superyears?
                list(c(2019, 2020)),   # Blocks of superyears
                c(2.5, 5.5, 0.2))      # Proj catch range
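The positional list above relies on comments to say what each slot means. One alternative is a named list, sketched here with the same values for APRU. The element names are hypothetical and do not come from the assessment code.

```r
# Same values as the slide; element names are assumptions for illustration.
species_options <- list(
  APRU = list(
    name             = "APRU",
    M                = "SW_Then",
    growth           = "SW_BBS_BIOS",
    length_weight    = "Kamikawa",
    maturity         = "SW_BBS_BIOS",
    use_init_F       = FALSE,
    R0_prof_range    = c(0.5, 1.6),
    B_target         = 0.29,
    superyears       = TRUE,
    superyear_blocks = list(c(2019, 2020)),
    proj_catch_range = c(2.5, 5.5, 0.2)
  )
)

# Options are looked up by name instead of by position
species_options$APRU$B_target
species_options[["APRU"]]$R0_prof_range
```

Named elements make it harder to shift an argument by one position when a new option is added.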
Arguments we changed frequently
DirName <- "65_Base"              # Name of directory
runmodels <- T                    # Run ss.exe
printreport <- T                  # Run ss_diags report
Create_species_report_figs <- F   # Produce formatted figures and tables (Word document)
N_boot <- 0                       # Bootstrap on/off
N_foreyrs <- 0                    # Nyears for forecast
RD <- F                           # Run diagnostics
ProfRes <- .1                     # R0 profile resolution
profile <- "SR_LN(R0)"            # Parameter to profile
Begin <- c(1967, 1986)[1]         # Model start year
DeleteForecastFiles <- T          # Remove extra files
Step 04: Build models and run
The core function is Build_All_SS()
Wrapper function for modular functions:
Build_Data()
Build_Control()
Build_Starter()
Build_Forecast()
Incorporate parameter information stored in Google Sheets
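The wrapper-around-modular-functions idea can be illustrated with stubs. The slides do not show the bodies or arguments of `Build_All_SS()`, `Build_Data()`, `Build_Control()`, `Build_Starter()`, or `Build_Forecast()`, so everything below (function names, arguments, and file names) is a hypothetical sketch of the pattern, not the actual implementation.

```r
# Stubs standing in for the modular builders. Each one only returns the path
# it would write to; the real functions and their arguments are not shown in
# the slides, so these are assumptions.
build_data_stub     <- function(species, dir) file.path(dir, species, "data.ss")
build_control_stub  <- function(species, dir) file.path(dir, species, "control.ss")
build_starter_stub  <- function(species, dir) file.path(dir, species, "starter.ss")
build_forecast_stub <- function(species, dir, n_foreyrs) {
  file.path(dir, species, paste0("forecast_", n_foreyrs, "yrs.ss"))
}

# Hypothetical wrapper: one call runs every builder with shared arguments.
build_all_sketch <- function(species, dir, n_foreyrs = 0) {
  list(
    data     = build_data_stub(species, dir),
    control  = build_control_stub(species, dir),
    starter  = build_starter_stub(species, dir),
    forecast = build_forecast_stub(species, dir, n_foreyrs)
  )
}

files <- build_all_sketch("APRU", dir = "65_Base")
names(files)
```

Because each builder is a separate function, one piece can be rerun or tested on its own, while the wrapper keeps the usual run to a single call.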